Iteration Aware Prefetching for Large Multidimensional Scientific Datasets
Abstract
Most caching and prefetching research either does not take advantage of prior knowledge of access patterns or does not adequately address the storage issues inherent in multidimensional scientific data. Given an access pattern specified as an iteration over a multidimensional array stored in a disk file, we use prefetching to greatly reduce the number of disk accesses and to partially hide read latency from the user. We assume the access pattern is not known until runtime, in contrast to chunking methods, which preprocess a file for a particular access pattern. Our results on plain, unpreprocessed files are competitive with chunking. Our method can also be applied to chunked files to achieve additional performance improvements.
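As a rough illustration of why knowing the iteration in advance can cut disk accesses, the following minimal sketch plans block reads for a window of upcoming iteration steps. It is not the authors' algorithm: the row-major file layout, the fixed block size, and the helper names `row_major_offset`, `iterate`, and `plan_reads` are all assumptions made for this example.

```python
from itertools import islice, product


def row_major_offset(index, shape):
    """Linear element offset of `index` in a row-major array of `shape`."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset


def iterate(shape, axis_order):
    """Yield indices; axis_order[0] varies slowest, axis_order[-1] fastest."""
    for combo in product(*(range(shape[a]) for a in axis_order)):
        index = [0] * len(shape)
        for axis, value in zip(axis_order, combo):
            index[axis] = value
        yield tuple(index)


def plan_reads(indices, shape, block_elems):
    """Map upcoming indices to distinct file blocks and merge adjacent
    blocks into (first_block, block_count) runs, one read per run."""
    blocks = sorted({row_major_offset(ix, shape) // block_elems
                     for ix in indices})
    runs = []
    for b in blocks:
        if runs and runs[-1][0] + runs[-1][1] == b:
            # Block is contiguous with the previous run: extend that run.
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            runs.append((b, 1))
    return runs


# Hypothetical 4 x 8 array stored row-major, 4 elements per disk block.
shape = (4, 8)

# Column-wise traversal (axis 1 slowest): the next 8 steps touch 4 blocks.
column_window = list(islice(iterate(shape, axis_order=(1, 0)), 8))
column_plan = plan_reads(column_window, shape, block_elems=4)

# Row-wise traversal (axis 0 slowest): the next 8 steps form one run.
row_window = list(islice(iterate(shape, axis_order=(0, 1)), 8))
row_plan = plan_reads(row_window, shape, block_elems=4)

print(column_plan)  # [(0, 1), (2, 1), (4, 1), (6, 1)]
print(row_plan)     # [(0, 2)]
```

In this toy case, eight element accesses collapse into four block reads for the column-wise traversal and a single two-block read for the row-wise one. A real system would also have to bound the window by available cache memory and overlap the reads with computation.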
Related resources
Out-of-core visualization using iterator-aware multidimensional prefetching
Visualization of multidimensional data presents special challenges for the design of efficient out-of-core data access. Elements that are nearby in the visualization may not be nearby in the underlying data file, which can severely tax the operating system’s disk cache. The Granite Scientific Database System can address these problems because it is aware of the organization of the data on disk,...
SCOUT: Prefetching for Latent Structure Following Queries
Today’s scientists are quickly moving from in vitro to in silico experimentation: they no longer analyze natural phenomena in a petri dish, but instead they build models and simulate them. Managing and analyzing the massive amounts of data involved in simulations is a major task. Yet, they lack the tools to efficiently work with data of this size. One problem many scientists share is the analys...
SCOUT: Prefetching for Latent Feature Following Queries
Today’s scientists are quickly moving from in vitro to in silico experimentation: they no longer analyze natural phenomena in a petri dish, but instead they build models and simulate them. Managing and analyzing the massive amounts of data involved in simulations is a major task. Yet, they lack the tools to efficiently work with data of this size. One problem many scientists share is the analys...
Spatial prefetching for out-of-core visualization of multidimensional data
In this paper we propose a technique called storage-aware spatial prefetching that can provide significant performance improvements for out-of-core visualization. This approach is motivated by file chunking in which a multidimensional data file is reorganized into multidimensional sub-blocks that are stored linearly in the file. This increases the likelihood that data close in the n-dimensional...
Context-Aware Prefetching at the Storage Server
In many of today’s applications, access to storage constitutes the major cost of processing a user request. Data prefetching has been used to alleviate the storage access latency. Under current prefetching techniques, the storage system prefetches a batch of blocks upon detecting an access pattern. However, the high level of concurrency in today’s applications typically leads to interleaved blo...